Elevation Map



Physics-Aware Downsampling with Deep Learning for Scalable Flood Modeling

Neural Information Processing Systems

Floods are the most common natural disaster in the world, affecting the lives of hundreds of millions. Flood forecasting is therefore a vitally important endeavor, typically achieved using physical water flow simulations, which rely on accurate terrain elevation maps. However, such simulations, based on solving partial differential equations, are computationally prohibitive on a large scale. This scalability issue is commonly alleviated using a coarse grid representation of the elevation map, though this representation may distort crucial terrain details, leading to significant inaccuracies in the simulation. Contributions: We train a deep neural network to perform physics-informed downsampling of the terrain map: we optimize the coarse grid representation of the terrain maps so that the flood prediction matches the fine grid solution. For the learning process to succeed, we configure a dataset specifically for this task. We demonstrate that with this method it is possible to achieve a significant reduction in computational cost while maintaining an accurate solution. A reference implementation accompanies the paper, along with documentation and code for dataset reproduction.
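To see why naive downsampling can break a flood simulation, consider the toy numpy example below (our own illustration, not the paper's method): average pooling dilutes a one-cell-wide levee into its low-lying neighbours, while max pooling preserves the barrier but inflates elevation elsewhere. The paper's learned downsampler instead optimizes each coarse cell so that the coarse-grid flood solution matches the fine-grid one.

```python
import numpy as np

# A fine 4x4 elevation grid containing a thin levee (one row of
# height 5.0) surrounded by flat low ground.
fine = np.zeros((4, 4))
fine[1, :] = 5.0  # one-cell-wide levee

# 2x downsampling by average pooling: the levee's height is averaged
# with adjacent low cells, so a coarse-grid simulation would let
# water flow over the weakened barrier.
coarse_avg = fine.reshape(2, 2, 2, 2).mean(axis=(1, 3))

# 2x downsampling by max pooling: the barrier survives, but every
# coarse cell containing any high point is overestimated.
coarse_max = fine.reshape(2, 2, 2, 2).max(axis=(1, 3))

print(coarse_avg)  # levee diluted to 2.5
print(coarse_max)  # levee kept at 5.0
```

Neither fixed pooling rule is right everywhere, which is the motivation for learning the coarse representation end-to-end against the fine-grid flood solution.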



MCTED: A Machine-Learning-Ready Dataset for Digital Elevation Model Generation From Mars Imagery

Osadnik, Rafał, Gómez, Pablo, Bohacek, Eleni, Bahia, Rickbir

arXiv.org Artificial Intelligence

This work presents MCTED, a new machine-learning-ready dataset for the Martian digital elevation model (DEM) prediction task. The dataset was generated with a comprehensive pipeline that processes high-resolution Mars orthoimage and DEM pairs from Day et al., yielding 80,898 data samples. The source images were gathered by the Mars Reconnaissance Orbiter using the CTX instrument, providing very diverse and comprehensive coverage of the Martian surface. Given the complexity of the processing pipelines used in large-scale DEMs, the original data often contain artefacts and missing data points, so we developed tools to fix or mitigate them. We divide the processed samples into training and validation splits, ensuring that the two splits cover no mutual areas, to avoid data leakage. Every sample in the dataset consists of an optical image patch, a DEM patch, and two mask patches indicating values that were originally missing or were altered by us, allowing future users of the dataset to handle altered elevation regions as they please. We provide statistical insights into the generated dataset, including the spatial distribution of samples and the distributions of elevation values, slopes, and more. Finally, we train a small U-Net architecture on the MCTED dataset and compare its performance to a monocular depth estimation foundation model, DepthAnythingV2, on the task of elevation prediction. We find that even a very small architecture trained specifically on this dataset beats the zero-shot performance of a depth estimation foundation model like DepthAnythingV2. We make the dataset and the code used for its generation completely open source in public repositories.
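The leakage-free split described above can be sketched as follows. This is our own minimal construction, not the MCTED pipeline; the function and parameter names (`spatial_split`, `tile_size`, `val_fraction`) are hypothetical. The idea is to assign whole spatial tiles, rather than individual patches, to one split, so patches covering the same area can never appear in both.

```python
import random
import numpy as np

def spatial_split(centers, tile_size=100.0, val_fraction=0.2, seed=0):
    """Assign patches to train/val by the spatial tile of their centre."""
    tiles = np.floor(np.asarray(centers, dtype=float) / tile_size).astype(int)
    keys = [tuple(t) for t in tiles]
    # Shuffle the unique tiles deterministically, then reserve a
    # fraction of tiles (not patches) for validation.
    unique = sorted(set(keys))
    random.Random(seed).shuffle(unique)
    n_val = max(1, int(len(unique) * val_fraction))
    val_tiles = set(unique[:n_val])
    is_val = np.array([k in val_tiles for k in keys])
    return ~is_val, is_val

# Four patch centres: two in one tile, two in another.
centers = np.array([[10, 10], [20, 15], [350, 80], [360, 90]])
train_mask, val_mask = spatial_split(centers)
# Patches sharing a tile always land in the same split.
```

Splitting at the tile level is what prevents leakage: a random per-patch split would routinely place overlapping patches on both sides.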


LT-Exosense: A Vision-centric Multi-session Mapping System for Lifelong Safe Navigation of Exoskeletons

Wang, Jianeng, Mattamala, Matias, Kassab, Christina, Chebrolu, Nived, Burger, Guillaume, Elnecave, Fabio, Petriaux, Marine, Fallon, Maurice

arXiv.org Artificial Intelligence

Figure 1: LT-Exosense can merge multiple sessions generated by a previous work, Exosense, a vision-centric scene understanding system whose sensing unit (top right) is integrated into a self-balancing exoskeleton (b). The merged map (a) contains five sessions, with colored contours indicating the coverage area of each session. Such a merged map can be further converted into a navigation map, enabling obstacle-free planning spanning multiple sessions.

Abstract -- Self-balancing exoskeletons offer a promising mobility solution for individuals with lower-limb disabilities. For reliable long-term operation, these exoskeletons require a perception system that is effective in changing environments. In this work, we introduce LT-Exosense, a vision-centric, multi-session mapping system designed to support long-term (semi-)autonomous navigation for exoskeleton users. LT-Exosense extends single-session mapping capabilities by incrementally fusing spatial knowledge across multiple sessions, detecting environmental changes, and updating a persistent global map. This representation enables intelligent path planning, which can adapt to newly observed obstacles and recover previous routes when obstructions are removed. We validate LT-Exosense through several real-world experiments, demonstrating a scalable multi-session map that achieves an average point-to-point error below 5 cm when compared against ground-truth laser scans.
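The multi-session fusion and change-detection idea can be illustrated on occupancy grids. This is a deliberately simplified toy of our own, not the LT-Exosense implementation: a persistent map is updated with a new session, newest observations win, and cells whose known state flipped are flagged as environmental changes.

```python
import numpy as np

# Cell states: 1 = occupied, 0 = free, -1 = unknown (never observed).
def merge_session(global_map, session, unknown=-1):
    """Fuse one session into the persistent map; flag changed cells."""
    observed = session != unknown
    # A change requires both maps to have observed the cell and disagree.
    changed = observed & (global_map != unknown) & (global_map != session)
    merged = np.where(observed, session, global_map)  # newest wins
    return merged, changed

g = np.array([[1, 0, -1],
              [0, 0, -1]])   # persistent map from earlier sessions
s = np.array([[1, 1,  0],
              [-1, 0, -1]])  # new session: an obstacle appeared at (0, 1)
merged, changed = merge_session(g, s)
# merged keeps old knowledge where the new session saw nothing,
# and `changed` marks exactly the cell where an obstacle appeared.
```

Flagging changes separately from merging is what lets a planner both avoid newly observed obstacles and reopen previously blocked routes, as the abstract describes.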


DPL: Depth-only Perceptive Humanoid Locomotion via Realistic Depth Synthesis and Cross-Attention Terrain Reconstruction

Sun, Jingkai, Han, Gang, Sun, Pihai, Zhao, Wen, Cao, Jiahang, Wang, Jiaxu, Guo, Yijie, Zhang, Qiang

arXiv.org Artificial Intelligence

Abstract -- Recent advancements in legged robot perceptive locomotion have shown promising progress. However, terrain-aware humanoid locomotion remains largely constrained to two paradigms: depth image-based end-to-end learning and elevation map-based methods. The former suffers from limited training efficiency and a significant sim-to-real gap in depth perception, while the latter depends heavily on multiple vision sensors and localization systems, resulting in latency and reduced robustness. To overcome these challenges, we propose a novel framework that tightly integrates three key components: (1) Terrain-Aware Locomotion Policy with a Blind Backbone, which leverages pre-trained elevation map-based perception to guide reinforcement learning with minimal visual input; (2) Multi-Modality Cross-Attention Transformer, which reconstructs structured terrain representations from noisy depth images; (3) Realistic Depth Image Synthesis Method, which employs self-occlusion-aware ray casting and noise-aware modeling to synthesize realistic depth observations, achieving over a 30% reduction in terrain reconstruction error. This combination enables efficient policy training with limited data and hardware resources, while preserving critical terrain features essential for generalization. Humanoid robots offer immense potential for enabling autonomous mobility in human-centric, unstructured environments. Achieving this vision requires the development of perceptive locomotion systems that integrate visual perception and control, enabling real-time gait adaptation to complex terrain.
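The noise-aware part of realistic depth synthesis can be sketched with a simple corruption model. This is our own simplification, not the paper's model (which additionally uses self-occlusion-aware ray casting, omitted here): a clean rendered depth image gets depth-dependent Gaussian noise plus random pixel dropout to imitate missing sensor returns.

```python
import numpy as np

def corrupt_depth(depth, sigma_rel=0.01, dropout_p=0.05, seed=0):
    """Add range-dependent noise and dropout to a clean depth image."""
    rng = np.random.default_rng(seed)
    # Noise magnitude grows with range, as on real depth cameras.
    noisy = depth + rng.normal(0.0, sigma_rel * depth)
    # Randomly invalidate pixels to imitate missing returns.
    drop = rng.random(depth.shape) < dropout_p
    noisy[drop] = 0.0  # 0 marks an invalid depth pixel
    return noisy

clean = np.full((60, 80), 2.0)  # a flat wall 2 m in front of the camera
noisy = corrupt_depth(clean)
```

Training the terrain-reconstruction transformer on such corrupted images, rather than clean renders, is what narrows the sim-to-real gap the abstract points to.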




Parkour in the Wild: Learning a General and Extensible Agile Locomotion Policy Using Multi-expert Distillation and RL Fine-tuning

Rudin, Nikita, He, Junzhe, Aurand, Joshua, Hutter, Marco

arXiv.org Artificial Intelligence

Legged robots are well-suited for navigating terrains inaccessible to wheeled robots, making them ideal for applications in search and rescue or space exploration. However, current control methods often struggle to generalize across diverse, unstructured environments. This paper introduces a novel framework for agile locomotion of legged robots by combining multi-expert distillation with reinforcement learning (RL) fine-tuning to achieve robust generalization. Initially, terrain-specific expert policies are trained to develop specialized locomotion skills. These policies are then distilled into a unified foundation policy via the DAgger algorithm. The distilled policy is subsequently fine-tuned using RL on a broader terrain set, including real-world 3D scans. The framework allows further adaptation to new terrains through repeated fine-tuning. The proposed policy leverages depth images as exteroceptive inputs, enabling robust navigation across diverse, unstructured terrains. Experimental results demonstrate significant performance improvements over existing methods in synthesizing multi-terrain skills into a single controller. Deployment on the ANYmal D robot validates the policy's ability to navigate complex environments with agility and robustness, setting a new benchmark for legged robot locomotion.
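The distillation loop described above follows the DAgger pattern: the student policy visits states, the terrain-specific experts relabel those states with their actions, and the student is refit on the growing aggregate dataset. Below is a schematic toy in a linear setting, entirely our own construction rather than the paper's training code.

```python
import numpy as np

rng = np.random.default_rng(0)
# Two "experts", one per terrain, each a simple linear policy a = k * s.
experts = [lambda s: 2.0 * s, lambda s: -1.0 * s]

def rollout(policy_w, n=50):
    # States the student visits; dynamics are omitted for brevity.
    return rng.normal(size=(n, 1))

X, Y = [], []
w = np.zeros((1, 1))                 # student policy: a = s @ w
for it in range(3):                  # DAgger iterations
    for expert in experts:           # visit every terrain
        states = rollout(w)
        X.append(states)
        Y.append(expert(states))     # expert relabels the student's states
    Xa, Ya = np.vstack(X), np.vstack(Y)
    # Refit the student on the aggregated dataset (least squares).
    w, *_ = np.linalg.lstsq(Xa, Ya, rcond=None)
# w converges toward a least-squares compromise between the experts;
# the paper then fine-tunes the distilled policy with RL on more terrains.
```

Aggregating labels across iterations, rather than training only on the latest rollout, is the key DAgger ingredient: it keeps the student anchored to every state distribution it has induced so far.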


Leg Exoskeleton Odometry using a Limited FOV Depth Sensor

Xavier, Fabio Elnecave, Viozelange, Matis, Burger, Guillaume, Pétriaux, Marine, Deschaud, Jean-Emmanuel, Goulette, François

arXiv.org Artificial Intelligence

Abstract -- For leg exoskeletons to operate effectively in real-world environments, they must be able to perceive and understand the terrain around them. However, unlike other legged robots, exoskeletons face specific constraints on where depth sensors can be mounted due to the presence of a human user. These constraints lead to a limited Field Of View (FOV) and greater sensor motion, making odometry particularly challenging. To address this, we propose a novel odometry algorithm that integrates proprioceptive data from the exoskeleton with point clouds from a depth camera to produce accurate elevation maps despite these limitations. Our method builds on an extended Kalman filter (EKF) to fuse kinematic and inertial measurements, while incorporating a tailored iterative closest point (ICP) algorithm to register new point clouds with the elevation map. Experimental validation with a leg exoskeleton demonstrates that our approach reduces drift and enhances the quality of elevation maps compared to a purely proprioceptive baseline, while also outperforming a more traditional point-cloud-map-based variant.

I. INTRODUCTION. Self-balancing leg exoskeletons enable individuals with motor disabilities to walk completely hands-free. Currently, their use is restricted to hospitals and rehabilitation centers, where they are operated under strict supervision.
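The predict/update structure behind this kind of fusion can be shown in one dimension. This is a generic textbook Kalman step, our own sketch rather than the paper's filter: kinematic/inertial odometry drives the prediction, and an ICP-style pose measurement provides the correction that bounds drift.

```python
import numpy as np

def ekf_step(x, P, u, z, Q=0.1, R=0.05):
    """One 1-D predict/update cycle: odometry increment u, pose measurement z."""
    # Predict: integrate the odometry increment, inflate uncertainty.
    x_pred = x + u
    P_pred = P + Q
    # Update: fuse the ICP-derived position measurement z.
    K = P_pred / (P_pred + R)          # Kalman gain
    x_new = x_pred + K * (z - x_pred)  # corrected state
    P_new = (1.0 - K) * P_pred         # reduced uncertainty
    return x_new, P_new

x, P = 0.0, 1.0
# Three steps of 1 m odometry, with slightly noisy pose measurements.
for u, z in [(1.0, 1.02), (1.0, 2.05), (1.0, 2.98)]:
    x, P = ekf_step(x, P, u, z)
# x ends near the true 3.0 m, and P shrinks as measurements accumulate.
```

Without the update step, the covariance P grows without bound and the estimate drifts with the odometry, which is exactly the failure mode of the purely proprioceptive baseline.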